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Abstract. Source separation is one of the signal processing's main emerging 
domain. Many techniques such as maximum hkeUhood (ML), Infomax, cumu- 
lant matching, estimating function, etc. have been used to address this difficult 
problem. Unfortunately, up to now, many of these methods could not account 
completely for noise on the data, for different number of sources and sensors, for 
lack of spatial independence and for time correlation of the sources. Recently, 
the Bayesian approach has been used to push farther these limitations of the con- 
ventional methods. This paper proposes a unifying approach to source separation 
based on the Bayesian estimation. We first show that this approach gives the possi- 
bility to explain easily the major known techniques in sources separation as special 
cases. Then we propose new methods based on maximum a posteriori (MAP) 
estimation, either to estimate directly the sources, or the mixing matrices or even 
both. 
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1. Introduction 

The simplest model for a source separation is 

x{t) = As{t), (1) 

where A is a mixing matrix, s(t) is a vector of sources and x{t) a vector of 
independent measurements. The main task is then to recover s{t), but one 
may instead be interested in recovering a separating matrix B such that 
s{t) = B x{t). When A is invertible, it is natural to assume that B = A^^ 
or S = S A A^^ where XI is a permutation matrix and A a diagonal scaling 
matrix. 

Many source separation algorithms have been recently proposed based 
on likehhood |, |, |, |, |, |, i, contrast function |, 0, ||], estimat- 
ing function |l5|, information theory ^, |l8|, |l9[, and more 
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generally on principle component analysis (PCA) [pO|, Independent fac- 



tor analysis (IFA) 22, 23| and independent component analysis (ICA) 



|T|, pi |25|. All these methods assume that the mixing matrix A is invert- 
ible and mainly search for a separating matrix B such that the components 
of y{t) = B x{t) be independent. This means that all these methods im- 
plicitly assume that the sources s{t) are independent. This may not be 
the case in some applications. However, the main differences between these 
methods are in the way they try to insure this independence. 

— Maximum likelihood (ML) techniques use directly the independence 
property by assuming 

P{S) = X{P^{S^) (2) 
i 

and as a result 



or equivalently 

p{x\B) = \det{B)\Y[p, {[Bx]i) = \det{B)\l[pi{yi), (4) 

i i 

where yi = [A^^x]i = [Bx]i and where p{s) is the probability density 
function of the source vector s. The ML estimate of the separating 
matrix is defined as 

B = argmax{logp(a;|B)} 
B 

= argmax < Vri(yi) + log |det(S)| } with rj(yi) = logpi(yj). 
B [ i J 

A great number of algorithms have been proposed to perform this 
optimization [17, |^. 

Infomax techniques use the entropy of y = Bx as a measure of inde- 
pendence H, |2^, H, |2|, 0: 

S = -^Pi{yi)logpi{yi). (6) 



Thus is a function of the separating matrix B and one tries to 
optimize S with respect to B. 
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M-estimation techniques define an estimate for the separating matrix 
B such that 

^Y.Hiy{t)) = [0] with yit) = Bxit), (7) 

t 

where it is assumed to have T independent observations {x{l), . . . , x{T)} 
and where H is an appropriately defined matrix valued function. [0] 
represents a matrix whose elements are all equal to zero. We can note 
that M-estimate methods generalize the ML estimation method since 
the latter can be obtained by taking 

i^to) = «^. (8) 

Contrast function minimization techniques are based on the optimiza- 
tion of a contrast function c{y) = c (Bx) which takes its extremal 
value when S is a separating matrix ||9|, |lO|. Typical examples are 
the contrast functions measuring, in some way, the independence of 
the components of y, sometimes subject to the constraint that y{t) be 
spatially white 

Wyit)y^it) = I. (9) 

t 

Higher order statistics (HOS) techniques try to insure the indepen- 
dence of the components of y by minimizing, under the whiteness 
constraint, a contrast function related to the statistics of the order 



greater that two such as the cumulants |3l|, |32|, ^] . 

The main limitations of these techniques are the following: 

- None of these techniques consider the possible errors on the model or 
the measurement (sensor) noises; 

- All these methods assume that the mixing matrix A is invertible and 
cannot account for the cases in which A is rectangular (number of 
sensors different from the number of sources). 

- All these methods assume that the sources are independent. Some 
assume the sources to be also temporally white. 

Recently, a few works using the Bayesian approach have been presented 



to push farther the limits of these methods |3^, 35, 37, 27, 27, 38, |39|] . 

In the following, we first present the basics of the Bayesian approach, 
then we show how some of the preceeding techniques can be obtained as 
special cases, and finally, we propose new ideas to account for spatial cor- 
relation between neighbor sources or time correlation of the sources. 
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2. Bayesian approach 

The main idea in tlie Bayesian approach is to use not only the hkeUhood 
p • • • , x{T)\A) but also some prior knowledge about the sources s and 

the mixing matrix A through the assignment of prior probabilics p[s) and 
p{A). Then, noting xi„t = {x{l), x{T)} and si„t = {s(l), • • • , s{T)} 
and using these direct probability laws we determine the posterior law 

logp {A, si„t\xi„t) = iogp {xi„t\A, si„t) + logp(A) + Tlogp{s) + cte, 

(10) 

where we assumed the independence of the sources s and the mixing matrix 
A. 

Prom this posterior probability law we can deduce any inference about 
A and s. For example, we can estimate both A and s by a joint maximum 
a posteriori (JMAP) criterion using an alternate maximization algorithm. 
We can also focus on the estimation of the mixing matrix A by marginal- 
izing this posterior law with respect to s to obtain p{A\xi,,t) and use the 
resulting MAP criterion to estimate A. Finally, we can integrate A from 
the joint law to obtain p{si,,t\xi..t) and estimate the sources from this 
marginal posterior law. 

Now, before going further in details of these three methods, we are going 
to illustrate some special cases which result in some classical techniques. 

2.1. Exact invertible model and independent sources 

If we assume that the model x = As is exact and that there is not 
any measurement noise and that the mixing matrix A is invertible and 
well conditioned, then we can only look for a separating matrix B = A~^. 
Indeed, as in conventional methods, if we assume that the sources s are 
independent, we have the following relations: 

p{s) = l[p^{s^) (11) 

i 

and so 

pix\B) = \detiB)\]lpii[Bx]i), (12) 

i 

where Pi{si) is the probability density function of the source component i. 
Using these relations, and noting by y{t) = B x{t) we have 



\ogp {B\xi„t) = logp {xi„t\B) + logp(S) + cte, (13) 
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where 



log p{xi..t\B) = T log \det{B)\+Y,T.^og Pi {yi{t)) 



and p{B) is a probabihty distribution on the separating matrix B. Here 
we assume that we can assign a probabihty law p{A) to the mixing matrix 
A or equivalently p{B) to the separating matrix B to translate any prior 
knowledge we have about (or we wish to impose to) them. For example, 
we may know (or assume) that the mixing matrix is such that 



A 



(14) 



k I 



for some e; or we may wish that the separating matrix B be such that its 
determinant |det(-B)| ^ and not very far from one. In the first case we 
can choose 



1 



p{A) oc exp 
and in the second case 
Some other possibilities are: 



exp 



fc I 



p{B) OC |det(S)|. 



(15) 
(16) 



p{A) oc exp 



2a2 



11- Al 



exp 



1 

'2^ 



E(l - + E"l' 



(17) 



which tries to impose \ak,i\ ~ 1, A; = / and \ak^i\ — 0, k ^ I; 



p{A) oc exp 



1 



1/ - AA 



t||2 



-— E(i-ii«Mf)'-E[^^*]'v 

k Ij^k 

(18) 

when the number of sources is less than the number of the sensors; and 



1 



p{A) oc exp 



-At||/-A*A||2 



exp 



1 

'2^ 



Ed 



a*,/ 



|2\2 



kf^l 



(19) 

when the number of sources is greater than the number of the sensores. 
These two last expressions have been proposed and used by Knuth [5^]. 
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Other choices based on prior knowledge of the geometrical positions of the 
sources and receivers and knowledge of the signal propagation law for an 



acoustical application have been used by |35 



Now, if we consider the MAP estimation, the MAP criterion to optimize 
becomes 



JiB) =logp{B\xi„T) =Tlog\detiB)\+J2Y.^ogpi{yi{t))+logp{B)+cte. 

(20) 



t i 



Searching now for the MAP solution, the necessary condition is 
dJ{B) 



dB 



[0]^-^/f(z/(t)) = [0] 



(21) 



where H \s a. matrix valued function given by 
d 



H{y) 



dB 



log Pi (y,) + log |det(B)| + - logp(B) 



(22) 



As an example, consider a uniform a priori law for B. Then we obtain 
the classical ML estimate which satisfies 



J2H{y{t)) = [0] with J/ (y) = 0(y) y* - I, 

t 

where 0(y) = [(^i(yi), • • • , 4>n{yn)Y with 



(23) 



Pi{z)' 



(24) 



One can add some extra constraints to this optimization. For example, 
we can optimize the MAP criterion subject to the constraint ^ J2t vi^) — 
I which leads again to 

J2 H {y{t)) = [ ] with H (y) = a{y y' - I) + P {(t>{y) y' + y (t>\y)) ■ 

(25) 

Note that in all these relations, (pii^) is related to the probability dis- 
tribution of the source number i. The following table gives the expression 
of this function for a few known cases. 
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Gauss 


p{z) 


oc exp 


—az^ 




0(z) 


= 2az 


Laplace 


p{z) 


oc exp 


—a 


\A] 








= asign(z) 


Cauchy 


p{z) 




1 












^1 + 


{z/af 








Gamma 


p{z) 


oc exp [- 


-I3z] 




<P{z) 


= —a/z + (3 


sub-Gaussian law 


p{z) 


oc exp 


1 2 


sech^(z) 


Hz) 


= z + tanh(2;) 


Mixture of Gaussians 


p{z) 


oc exp 
+ exp 


-\{z-a)\ 
-\{z + af 






= az — atanh(a2:) 



Remark: 

H{y) in equations (^TJ) and ( [2^ ) corresponds to the gradient of MAP and 
ML criteria. A common technique to obtain the MAP or the ML solutions 
is then to use a gradient based algorithm such as 

_B(fc+i) = _bW _ ^H{y) (26) 

where {k) and {k + 1) stand for two successive iterations in static case or 
two successive time instants for dynamic case. This equation forms the 
main body of a great number of neural network (NN) based algorithms for 
source separation. 

2.2. Accounting for errors 

Here we relax the inversibility of the matrix A and take also account 
of the errors on the data. As an example, we consider the case where the 
errors can be modelled by an additive term e{t): 

x{t) = As{t) + e{t), t = l,...,T. (27) 

We assume also that we can assign a probability law p{e) to e. In general, it 
is natural to assume that e(t) has independent components and is centered 
and white, i.e. 

\ogp{e{l), e(T)) =J2T. log^*^ • (^8) 

t i 

From this assumption, we obtain 

logp{xi„T\A, si„t) = E E 9^ (^^(*) - i^^Ut)) (29) 

t i 
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with qi{.) = logpi(.). 

Now, we can give the expression of the posterior law which is 

\ogp {A, si.,t\xi,.t) = logp (a;i..r| A, si..r) + logp(si..T) + logp(A) + cte 

= J2J2^i (^^(*) ~ i^^W)) + logp(si..T) + logp(A) + cte. 



(30) 



As mentioned before, from here, we can go in at least three directions: 

— First integrate p {A, si,,t\xi..t) with respect to A to obtain p(si..T|a;i..T) 
and estimate Si..r by 

S1..T = argmax{p(si..T|a;i..r)} ■ (31) 

Sl..T 

— Second integrate p (A, Si„t\xi..t) with respect to Si..t to obtain p(A|cci..t) 
and estimate A by 

A = argmax{p(A|a;i t)} • (32) 
A 

But, here, when A is obtained, we still have to obtain B = A and 
A may not be invertible. 

— Third, optimize p {A, si,,t\xi..t) simultaneously with respect to both 
S1..T and A by using an alternating optimization procedure such as 



s\ rp = argmax|p yA , ^l.tI^^l.t j j 

A*^*^^ = argmaxjp (^A, s^^^^'*|£Ci..t)| 
A 



(33) 



In the two first cases, the integrations can be done analytically only in 
the Gaussian case. We then obtain closed form expressions for the solutions. 

In any case, before applying any optimization, we have to ensure that 
the criterion to be optimized has at least an optimum and that this optimum 
is unique. 

2.3. Spatially independent and white sources 

The case where we can assume that the sources are independent and 
white is the simplest one. We have: 

logp(si..T) = Y.Y. ^ji^ji't)) 
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and 



t i 



t j 



(34) 

with Zi = [As]i. Then, we can omit the time summation. To simphfy the 
details of the derivations, let first assume 



p{A) oc exp 



2a? " " 



exp 



(35) 



Later, we will also consider other possibilities such as ([T7|), ([T8|) or (p^). 



Joint MAP estimationFiTst we consider the joint estimation of A and s 
where the alternating optimization algorithm becomes 



arg max 
s 



arg max < ^ q, 
A [ i 



J2qi {xi- Zi) +Y^rj{sj) 

^ EE«H 



I Xi Zi I 



The solution at each iteration has to satisfy 
d 



dsj 
d 



da 



ttij q'i {xi - Zi) + rj{sj) = 

i 

'( _ 



(36) 



(37) 



These equations are in general nonlinear and depend on the expressions of 
q and r. One exception is the particular case of Gaussian laws 



p,(n) ~A/-(0,a2) 



2ai ai 



and 



p^{s)^M{Q,al)^rj{s) 



2aV 



r'As) 



where we obtain two sets of linear equations to solve for Sj and aij : 
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<7; 



^Oij {xi - [As]i 



s-i = 



2 "J 



^2 ^"^^ [-'^*]*) 2^^j 



cr. 



^aij {xi - [As]i) - Xsj = 

i 

Sj {xi - [As]i) - iittij = 



(38) 

with A = cr^/o-g and n = ol/al. 

These two equations have to be solved in each iteration of alternating 
optimization procedure. Two strategies can be used : 

— Solve these equations for each Sj and then for each aij at each iteration: 



Si ^ij 



a 



■3*\ 



(39) 



with Xi = J2kj^j dikSk-, and ||aj* |p = Yl,j o-ij- This is a single coordinate- 
wise gradient descent based algorithm. 
— Solve these equations for all sj and then for all aij at each iteration: 



, I « = (A*A + AJ)-iA*a; 

A{x-As)-Xs = \ ^ ^ ^s\ss' + III)-' 

{x-As)s^-nA = ^ 



xs 



ss* 
Ws+jl 



(40) 

This is a bloc coordinate-wise gradient descent based algorithm. 
Remark 1: 

These two last closed form expressions give us the possibility to discuss 
the convergency of the joint MAP algorithm in the considered Gaussian 
case. We may immediately note that A obtained by this algorithm is not 
invertible. This means that, in the Gaussian hypothesis, this algorithm 
does not really separate the signals. Actually, we could remark this from 
the expression of the joint criterion in this case which is 

J{A, s) = - logp(A, s\x) = \\x - Asf + X\\sf + fi\\Af + cte. (41) 

As we can see, in this case, J{A, s) is a quadratic function of s for given A 
and a quadratic function of A for given s, but it is a biquadratic function of 
both A and s. This symctry property means that the joint MAP solution in 
this case is not unique. This criterion may even have an infinite equivalent 
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optima. The proposed iterative algorithm may then converge to any of 
these solutions depending on the initialization. 

Unfortunately, with any non Gaussian hypothesis, we can not obtain 
any closed form solution and the existance and the uniqueness of a global 
optimum is very hard to study. However, we can always propose either a 
fixed point or a gradient descent based algorithm to compute them numer- 
ically. For example, if we assume a Gaussian law for the noise, but non 
Gaussian prior laws for s and for A, we have 



J{A,s) = — logp{A,s\x) 



X 



As\ 



Xcl){s) + niP{\\A\\) + cte. (42) 



where <p{s) oc — logp(s) and ip{s) oc — logp(A). Note that the choice 
of these prior laws is then important if we want to eliminate the above 
mentionned symetry property and to be able to find a unique solution to 
the problem. Then, a gradient based algorithm writes: 



5(^+1) 



d 



/3W |^(A^'\3W) 



(43) 



where a^^^ and P^^'^ are two step parameters which can be either constant 
(fixed step gradient) or adaptive during the iterations (k). Replacing for 
the gradient expressions we obtain: 



2s W (a; 



(44) 



with SW = A^'^3W. 

A fixed point based algorithm writes: 



ll(SW) 
dA^^ > 



-1 

A 



s^^Hx - X 



x-x^^> 



(45) 



One can make the comparison with different neural network based algo- 
rithm. 



Remark 2: 

In the Gaussian hypothesis case, if we use the prior law (|1 



we obtain 
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similar closed form expressions equivalent to p9| ) 
~ A+ lla- l|2 

/\ -r ||u,j^ II (46) 
+ ^ ■^i ~ ^ 



Remark 3: 

The main interest of this approach is that we can, at least in theory, account 
for the existance of any correlation between Sj and ,8,^ or to model the 
temporal behavior of any source Sj{t), for example, via a markov model. 
We can also account for any prior information we may have about the 
mixing matrix A or impose any desired structure for the separating matrix 
B. For example, if we know that the sources are labelled in such a way that 
the sensor Xi is closer to the sources Sj, and Sj+i than to any others, 
we can use it by choosing a prior probability law 



p[A) oc exp 



(47) 



with 



Wi 



1 if i = j 

l/(2|i-j + l]) if i^j 



(48) 



or Wii = 1, Wi/i-i = Wi-i^i = a and (1 — a) for all the other coefficients Wij 
for some 0.5 < a < 1. Then the equations (|39|) and (|40|) become 



X/i ^ij (-^i 



Xi 



A + ||aj*|| 

(,Xi Xij 



(49) 



and 



s 
A 



{A^A + Xiy^A^x 



(50) 
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Marginal MAP estimationsNow, we consider the two other approaches of 
marginal MAP estimations. First we note that we can rewrite x = As 
with A is a (m X n) matrix as 



X = As = Sa 



(51) 



where S a (m x mn) bloc Toeplitz matrix and a a vector of dimension mn 
obtained by pilling up all the rows of the matrix A: 



/s* 



•• 

V 



and a = ( ai* a2* 



am*Y (52) 



To be able to obtain closed form expression, in the following we consider 
only the Gaussian case: 



p{x\A,s) oc exp 

p{A) oc exp 

p{s) oc exp 

which gives 



1 
1 



\\x - AsW"^ 
HA 



with i/;{A) = \\Af 
with (p{s) = ||s|p 



p{A, s\x) oc exp 



-^J(A,.) 



(53) 
(54) 
(55) 

(56) 



with 



J{A,s) = \\x-Asf + X(l){s)+i^2P{A) = \\x-Asf + X\\sf + n\\Af{57) 

= \\x - Saf + X(j)(S) + iixp(a) = \\x - Sa\f + —\\Sf + i^llaf (58) 

m 

Note that, in this case, J{A, s) is a quadratic function of s for fixed A and 
a quadratic function of A for fixed s, but it is a bilinear function of both 
A and s. This remark means that the joint MAP solution in this case is 
not unique. 

Note also that J{A, s) can be rewritten as 



J{A,s) = {s -sfPg {s-s)-s^Pg s + x^x + i^\\A\ 



(59) 
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with 



or as 



s = (A*A + Xiy^A^x and Ps = {A* A + A/)"^; 



J(a, 5) = (a - a)*P„ \a - a) - a*P„ 'a + a;*a; + A||5 



-1. 



(60) 



with 



a 



{S^S + fiiy^S^x and = (5*5 + /i/)-^ 



With these notations, it is then easy to obtain the expression of the marginals 
laws p{s\x) and p(A\x): 



-lnp(A|a;) oc -lndet(P^ ) - J{A,s) 



= -In 

— liip[s\x) (X —In 
= -In 



det(A*A + AJ) - x\x - As) + n\\Af (61) 

det(P;') I -J(a,s) 

det(5*5 + /xl) -x\x-Sa) + \s^s (62) 



In non Gaussian hypothesis for s and a where — \np{s) = </)(s) + cte and 

— Inp(a) = ip{a)+cte, we can always use the Laplace approximation of the 
posterior laws. Then, we can use the first lines of these two last equations by 
replacing det(-P5 ) and det(-Pjj ) by, the jacobians of their respective log 
probability densities: V|J(A, s) and VaJ(a, S) computed for the MAP 
estimates s = argmin^ {J{A, s)} and a = argmin^j {J{a, S)}. 

The marginal MAP solutions of A and s respectively satisfy ^^'^P(-^i^) = 

and g^g ^'^'^ = 0. Note that, excepted the Gaussian case where these 
equations have analytical solutions, we need a numerical optimisation algo- 
rithm to compute the solutions. For example, the marginal MAP estimate 
of A can be computed by the following iterative algorithm: 



,(fe) 



= arg max < In 
A ^ 

= arg max < In 
A ^ 



+ J(A,s 



det(A*A + AJ)| + a;*As('=-i) + /x^(A)} 



(63) 



where 



i^*^^ = arg min I J{A^^ '^'^ , s] 



arg mm 
s 



(64) 
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In general, neither of these equations have exphcite solutions and we 
have to do the optimization numerically. For example, a gradient based 
algorithm to compute the marginal MAP estimate of A writes: 

As('=) oc A*''"'\a; - A^'^^^^s) + A(/)'(s) ^^^^ 
AA^''^ oc A*(A*A + A/)-i + a;S(*--i) + ^'0'(A) 



2.4. Spatially correlated sources 

As a first extension we still assume that the sources are white, but they 
are spatially correlated, i.e. 

logp(si..T) = J2 ^(•^i • • • > SN{t)) 

t 

where r(si, . . . ,sn) represents the joint probability law of the sources. Then 
we obtain 

logp (A, si„t\xi..t) = YjYj^^i (^»(*) ~ y^W) + • • • ' SN{t))]+logp{A)+cte. 

t i 

(66) 

But the main difficulty here will be the modeling of these dependencies and 
simplification of the expression of the joint probability law r(si, . . . , sat). 
For example, if the sources are labelled in such a way that only the imme- 
diate neighbor sources are correlated, then we can use a first order Markov 
model and write 

r(si, . . . ,SAr) = ^r{sj\sj-i,Sj+i) (67) 

However, the complexity of the optimization algorithms depend on the 
expression of the potential function r(sj|sj_i, sj+i). A simple case is a 
Gaussian model where 



with D a tri-diagonal Toeplitz matrix with diagonal elements equal to 2 and 
off-diagonal elements equal to -1. In this case, the equations (40) become 

{ A\x- As)-\D^Ds = {s = {A*A + \D^D)-^A*x 

I (a; - As)s* - /iA = ^ | A = £cs*(ss* + /x/)"^ 



(68) 



16 



A. Mohammad-Djafari 



2.5. Spatially independent but colored sources 

Here we assume that the sources are mutually independent but that 
they are temporally colored, i.e. 

j 

where rj{sj{l), . . . , Sj{T)) represents the joint probabihty law of the differ- 
ent samples of source number j. Then we obtain 



logp(A,si..r|a;i..r) = JllZ^i - + l]^j(si(l)> • • • 

t i j 

+ J2T.4l + cte. (69) 

k I 

Here again, the main difficulty is the modelization and simplification of the 
expression of the joint probability laws rj{sj{l), . . . , Sj{T)). For example, 
we can use a first order markov chain model and write 

rj{sj{l),...,s,{T)) = J2rjisjit)\s,{t - 1)) (70) 

t 

As an example here we consider the Gaussian case (or equivalently first 
order AR models) : 

rjisjil), . . . , s,iT)) = -J2a,is,it) - s,it - l)f (71) 

t 

With this assumption we have 



s[% = argmax<j^^gi(a:;i(t)-yi(t))-^^Q!j(sj(t)-Sj(t-l))^ 
^ \ t i t j 

' ^''^ arg max l^Y^qi {xi{t) - yi{t)) + -f X] f 

(72) 

The solution at each iteration has to satisfy 



2 * ' (73) 



E ii (^^(*) - (*)) -EE i^At) - sj{t - 1)) = 

t i t j 

2 

t 

For the special case of Gaussian noise we obtain 
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' EE«ii - [M^it)) - EE2Ai(sjW - s,{t - 1)) 

t i t j 

E^iW {xi{t) - [As]i{t)) + 



with Aj = OLjol and 

The two algorithms of (|39|) and (pO|) in this case become: 



and 



s(t) 
A 



(A*A + A/)-i [diag {Ai, . . . , A„} s(t - 1) + A^x{t)\ 








(74) 



(75) 



(76) 



Here, we conclude the presentation of the Bayesian approach to source 
separation. Beside some of the details of implementation, we showed that 
the Bayesian approach gives us the possibility to push further some of the 
limitations of the classical techniques in source separation. In the next sec- 
tion we give a few preliminary numerical results to show the performances 
of the proposed algorithems. 



3. Simulation results 

In the following we give a few preliminary examples of simple source sep- 
aration problem to show some performances of the proposed methods. In 
all these examples, we used the following algorithm: 



y = {A^A + Xiy^A^x 
s = g{y) 

AA oc A\A^A + \I)'^ + XS + ^ip'{A) 
with A = /i = .1, = 100 and appropriate g. 

3.1. Example 1 

Hier, we considered two sources 



(77) 



si{t) = sin(500t-M0cos(50t)) 
S2{t) = sin(300t) 



, i = [0 : .001 : .499]. 



(78) 
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and used the mixing matrice A = |^ ^ 1 y ^° obtain the two set of 

data xi{t) and X2{t). Then we apphed the algorithm given in (^). The 
following figures show the obtained results si(t) and S2{t). 
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Fig. 2: Results of the source separation in Example 1: phase space distribution 
of sources, mixed signals and separated sources 



A Bayesian approach to source separation 



19 




1 O 1-2 O 2-1 O 1 



Fig. 3: Results of the source separation in Example 1: histograms of sources, 
mixed signals and separated sources 

The two sources are well separated. 

3.2. Example 2 

Hier, we considered three sources 

si{t) = sin(500f + 10cos(50t)) 
< S2{t) = sin(300t) , t = [0 : .001 : .499]. (7! 

ss{t) = sign(cos(120i - 5cos(50i))) 

and used the mixing matrice 





to obtain the three set of data xi(t), X2{t) and 3:3 (t). The following figures 
show the obtained results. 



20 



A. Mohammad-Djafari 



X2{t) 
X3{t) 















j — mf^ 








Fig. 4: Results of the source separation in Example 2. 
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Fig. 5: Results of the source separation in Example 2: phase space distribution 
of sources, mixed signals and separated sources 
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si x1 sh1 




-1 1 -2 2 -1 1 

Fig. 6: Results of the source separation in Example 2: histograms of sources, 
mixed signals and separated sources 



Here also the three sources are well separated. 



3.3. Example 3 

Hier, we considered the two sources of the first example, but we simu- 
lated the case where there are three receivers using the mixing matrice 




to obtain the three set of data xi{t), X2{t) and xs{t). Then we applied 
again the algorithm given in (|7^). The following figures show the obtained 
results. 
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'si{t) 

< 

'xi(i) 

X2{t) 
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Fig. 7: Results of the source separation in Example 3. 
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Fig. 8: Results of the source separation in Example 3: phase space distribution 
of sources, mixed signals and separated sources 
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Fig. 9: Results of the source separation in Example 3: histograms of sources, 
mixed signals and separated sources 



3.4. Example 4 

Hier, we considered the three sources of the Exammple 2 and simulated 
the case where there are only two receivers using the mixing matrice 



1. .2 1 
-.5 1. .2 



to obtain the two set of data xi{t) and X2{t). Then we applied again the 
algorithm given in (|7^). The following figures show the obtained results. 
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Fig. 7: Results of the source separation in Example 4. 
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Fig. 8: Results of the source separation in Example 4: phase space distribution 
of sources, mixed signals and separated sources 
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Fig. 9: Results of the source separation in Example 4: histograms of sources, 
mixed signals and separated sources 



4. Conclusions 

We investigated the use of the Bayesian estimation theory to source sepa- 
ration and showed that this approach has the potential to push farther the 
limits of the classical methods. This work is not really yet finished. We are 
going now to compare the performances of the proposed methods to other 
conventional ones on simulated and real data. 
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